OHI-Northeast | OHI Science | Citation policy
This script calculates catch by OHI region using landings data provided by NOAA>
Reference: [citation for source data; website, literature, contact information. Version of data (if relevant). Screenshots if a series of menus had to be navigated to obtain the data.]
Downloaded: April 3, 2019
Description: Commercial fish landings by statistical area
Time range: 1996-2017. Data provided annually
Format: Excel spreadsheet
Cleaning the raw data a bit by fixing column names and turning stat_area numeric.
raw <- read_excel(file.path(dir_anx, "_raw_data/NOAA_NMFS/catch_by_stat_area/Afflerbach_UCSB_Landings by Stat Area w Stock Name & Clam Trips_MAR 2019.xlsx"))
clean <- raw %>%
rename(year = YEAR,
stat_area = `STAT\r\nAREA`,
species = SPECIES,
pounds = `LBS LANDED \r\n(HAIL WT)`,
stock_id = `STOCK ID`,
stock = `STOCK NAME`) %>%
mutate(stat_area = as.numeric(stat_area))
head(clean)## # A tibble: 6 x 6
## year stat_area species pounds stock_id stock
## <dbl> <dbl> <chr> <dbl> <chr> <chr>
## 1 1996 0 CONFIDENTIAL SPECIES 11517161 <NA> <NA>
## 2 1996 462 COD 11827 CODGMSS GOM Cod
## 3 1996 462 CUSK 1065 <NA> <NA>
## 4 1996 462 FLOUNDER, AMERICAN PLAICE… 21191 PLAGMMA Plaice
## 5 1996 462 FLOUNDER, WITCH / GRAY SO… 5496 WITGMMA Witch Floun…
## 6 1996 462 HADDOCK 296 HADGM GOM Haddock
Since the data is provided by statistical landing area, we can use this information to infer what OHI region’s encompass or overlap with these areas. We have downloaded the shapefile for Statistical Areas from this public FTP NOAA site.
Load in the statistical areas and add area of each polygon as a column.
stat_shp <- sf::read_sf(file.path(dir_anx, "spatial/Statistical_Areas_2010_withNames.shp")) %>%
st_set_crs(p4s_nad83) %>% #set the CRS
st_transform(crs = crs(rgns)) #transform to our NE specific CRS
stat_shp$stat_area <- st_area(stat_shp) #add area as column
ggplot(stat_shp) +
geom_sf() +
theme_bw() +
labs(title = "Statistical areas") +
theme(legend.position = "none")Overlay statistical areas with our regions to find what ones are in our area
ohi_stat_areas <- st_intersection(rgns, stat_shp) #intersects statistical areas with OHI regions
ohi_stat_areas$ohi_area <- st_area(ohi_stat_areas) #calculate area of each overlapped polygon
ggplot(ohi_stat_areas) +
geom_sf(aes(fill = rgn_name)) +
geom_sf(data = ne_states, fill = "beige") +
theme_bw() +
labs(title = "Statistical areas overlapped with OHI regions") Calculate proportion of each statistical area in our OHI regions. For statistical areas that overlap with OHI regions, we can use proportional area overlap to adjust catch. We assume that catch is evenly distributed across each statistical area.
calc_prop_area <- ohi_stat_areas %>%
group_by(Id) %>%
mutate(ohi_rgn_prop_area = ohi_area/stat_area) #this column tells us how much of each OHI sub-region falls within the statistical area in our region
ggplot(calc_prop_area) +
geom_sf(aes(fill = ohi_rgn_prop_area)) +
geom_sf(data = ne_states, fill = "beige") +
theme_bw() +
labs(title = "Proportion of each \nstatistical area in OHI region") Now we calculate the total catch per species and year for each of the OHI regions.
First let’s filter the catch data to just the statistical areas in our region. We don’t care about the catch outside of these statistical areas.
region_catch <- clean %>%
filter(stat_area %in% ohi_stat_areas$Id) %>%
left_join(calc_prop_area, by = c("stat_area" = "Id")) %>%
mutate(catch = pounds*ohi_rgn_prop_area) %>% #adjusting catch by the proportional area with overlap
select(-area_km2, -FULL_NAME, -SHORT_NAME, -stat_area.y, -ohi_area, -NAFODIV) %>%
group_by(species, stock_id, stock, rgn_id, year, rgn_name) %>%
summarize(catch = sum(catch)) %>%
ungroup() %>%
mutate(display_name = ifelse(is.na(stock_id), species, stock_id))
head(region_catch, n = 20)## # A tibble: 20 x 8
## species stock_id stock rgn_id year rgn_name catch display_name
## <chr> <chr> <chr> <int> <dbl> <fct> <dbl> <chr>
## 1 ALEWIFE <NA> <NA> 3 1998 Gulf of Maine 2.63e3 ALEWIFE
## 2 ALEWIFE <NA> <NA> 3 1999 Gulf of Maine 4.34e1 ALEWIFE
## 3 ALEWIFE <NA> <NA> 3 2000 Gulf of Maine 2.41e2 ALEWIFE
## 4 ALEWIFE <NA> <NA> 3 2003 Gulf of Maine 4.66e1 ALEWIFE
## 5 ALEWIFE <NA> <NA> 3 2006 Gulf of Maine 0. ALEWIFE
## 6 ALEWIFE <NA> <NA> 3 2007 Gulf of Maine 0. ALEWIFE
## 7 ALEWIFE <NA> <NA> 4 2012 Mid-Atlantic Bi… 5.62e2 ALEWIFE
## 8 ALEWIFE <NA> <NA> 4 2013 Mid-Atlantic Bi… 6.40e3 ALEWIFE
## 9 ALEWIFE <NA> <NA> 4 2014 Mid-Atlantic Bi… 0. ALEWIFE
## 10 ALEWIFE <NA> <NA> 4 2015 Mid-Atlantic Bi… 0. ALEWIFE
## 11 ALEWIFE <NA> <NA> 4 2016 Mid-Atlantic Bi… 0. ALEWIFE
## 12 ALEWIFE <NA> <NA> 4 2017 Mid-Atlantic Bi… 0. ALEWIFE
## 13 ALEWIFE <NA> <NA> 5 2013 Connecticut 2.62e3 ALEWIFE
## 14 ALEWIFE <NA> <NA> 5 2014 Connecticut 0. ALEWIFE
## 15 ALEWIFE <NA> <NA> 5 2016 Connecticut 0. ALEWIFE
## 16 ALEWIFE <NA> <NA> 5 2017 Connecticut 0. ALEWIFE
## 17 ALEWIFE <NA> <NA> 6 1998 Maine 5.39e2 ALEWIFE
## 18 ALEWIFE <NA> <NA> 6 1999 Maine 8.90e0 ALEWIFE
## 19 ALEWIFE <NA> <NA> 6 2000 Maine 4.94e1 ALEWIFE
## 20 ALEWIFE <NA> <NA> 6 2003 Maine 9.56e0 ALEWIFE
p <- ggplot(region_catch, aes(x = year, y = catch, fill = display_name)) +
facet_wrap(~rgn_name, scales = "free_y") +
geom_bar(stat = "identity") +
theme_bw() +
theme(legend.position = 'none',
axis.text = element_text(size = 6),
axis.text.x = element_text(angle = 45))
plotly::ggplotly(p)Mapping just one species - Winter Flounder, George’s Bank
#filter just for 2017 winter flounder georges bank
wf <- region_catch %>%
filter(year == 2017, stock == "GB Winter Flounder") %>%
group_by(rgn_id, stock_id, stock, species, year) %>%
summarize(catch = sum(catch))
#join back to spatial info
wf_map <- rgns_simp %>%
left_join(wf, by = 'rgn_id')
#plot
ggplot(wf_map) +
geom_sf(aes(fill = catch))+
geom_sf(data = ne_states, fill = "beige") +
theme_bw() +
labs(title = "Winter Flounder - George's Bank")The data shared with us includes records of 0 catch. But there is still missing data. As an example, let’s look at ALEWIFE.
## [1] 1998 1999 2000 2003 2006 2007 2012 2013 2014 2015 2016 2017
Ok clearly we are missing data for 2001, 2002, 04-05, 2008-11. We don’t know if these are 0’s or missing data. We need to gapfill this missing data. When a species/state combination has missing data for a year, we can not assume it has a catch of 0. Since we calculate a rolling average of catch, NAs will remain as NA’s and the average will rely on just one or two years of catch. This is done to account for any wild fluctuations in catch year to year.
gf_data <- region_catch %>%
group_by(rgn_id, rgn_name, species, stock, stock_id, display_name) %>%
complete(year = 1998:2017) %>%
arrange(year) %>%
mutate(mean_catch = zoo::rollapply(catch, 3, mean, fill = NA, align = 'right')) %>% ## create a new column `mean_catch` with rolling mean of 3 yrs
filter(year > 2004) %>%
select(year, rgn_id, rgn_name, species, stock, stock_id, mean_catch, display_name) %>%
ungroup()
p <- ggplot(gf_data, aes(x = year, y = mean_catch, fill = display_name)) +
geom_bar(stat = "identity") +
facet_wrap(~rgn_name, scales = "free_y") +
theme_bw() +
theme(legend.text = element_text(size = 6),
legend.position = "below",
axis.text = element_text(size = 6),
axis.text.x = element_text(angle = 45))
plotly::ggplotly(p)Let’s look at total regional catch for each species (not stock)
#calculate total regional catch per species
species_catch <- gf_data %>%
group_by(species, year) %>%
summarize(sp_catch = sum(mean_catch, na.rm=T)) %>%
ungroup() %>%
group_by(year) %>%
mutate(yr_catch = sum(sp_catch),
catch_prop = sp_catch/yr_catch) %>%
ungroup() %>%
filter(year > 2004) p <- ggplot(species_catch, aes(x = year, y = catch_prop, fill = species)) +
geom_bar(stat = "identity") +
theme_bw() +
theme(legend.text = element_text(size = 6))
plotly::ggplotly(p)Clearly atlantic herring is making up the majority of catch! Atlantic herring is primarily a bait fishery, so we need to account for that since this goal is only measuring catch meant for human consumption. We adjust for this below.
Some of these species are harvested for food as well as other markets like pet food or bait. We want to make sure this goal captures catch meant for human consumption. We have data from NOAA that identifies the amount of catch per species, state and year meant for food, bait, and other markets. This data was cleaned in prop_catch_food_bait.Rmd.
prop_data <- read_csv("data/fish_catch_food_prop_rgn.csv")
toolbox_data <- gf_data %>%
left_join(prop_data) %>%
mutate(prop = ifelse(is.na(prop),1,prop),
mean_catch_times_prop = mean_catch*prop) %>%
filter(!market %in% c("BAIT", "NO MARKET", "CANNED PET FOOD"), #remove bait, pet food and no market records.
!is.na(mean_catch)) %>% #remove records with no catch (don't need them)
select(-market, -pounds_live_by_market, -total_pounds_live, -prop)